feat(serve): improve OpenAI API compatibility with usage, finish_reas… by markstur · Pull Request #771 · generative-computing/mellea

markstur · 2026-04-01T04:04:56Z

Misc PR

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

Link to Issue: Part of Create OpenAI API-compatible HTTP interface for mellea #521

feat(serve): improve OpenAI API compatibility with usage, finish_reason, and system_fingerprint

Add missing OpenAI API fields to m serve chat completion responses:

Add finish_reason field to Choice model (defaults to "stop")
Add usage field with token counts (prompt_tokens, completion_tokens, total_tokens)
extracted from ModelOutputThunk.usage when available
Add system_fingerprint field populated from model or provider metadata
Fix bug in model_options extraction (use model_dump().items() instead of iterating request)

Includes comprehensive test coverage with 13 unit tests verifying all new fields
and edge cases (missing data, partial data, fallback behavior).

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code as added
Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

…on, and system_fingerprint Add missing OpenAI API fields to m serve chat completion responses: - Add `finish_reason` field to Choice model (defaults to "stop") - Add `usage` field with token counts (prompt_tokens, completion_tokens, total_tokens) extracted from ModelOutputThunk.usage when available - Add `system_fingerprint` field populated from model or provider metadata - Fix bug in model_options extraction (use model_dump().items() instead of iterating request) Includes comprehensive test coverage with 13 unit tests verifying all new fields and edge cases (missing data, partial data, fallback behavior). Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

github-actions · 2026-04-01T04:05:08Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

ajbozarth · 2026-04-01T20:12:23Z

I used Claude to do a PR review, here are a few highlight you may want to look into:

Issues

system_fingerprint semantics (design concern)
The PR populates system_fingerprint with output.model (e.g., "gpt-4"). In OpenAI's spec, system_fingerprint is a hash/fingerprint of backend config, not the model name. The model name is already in response.model. This
conflates two different fields.

Options: use None always (cleanest), derive a real fingerprint hash, or rename/repurpose the field with a comment noting the divergence from spec.

Partial usage data: total_tokens silently wrong
In test_usage_with_partial_data, when only prompt_tokens: 10 is provided:

assert response.usage.total_tokens == 0  # Should default to 0

The comment says "should default to 0" but 0 is an incorrect total. It should be prompt_tokens + completion_tokens at minimum. This is a silent bad value going out over the API.

Unused patch import in test file
from unittest.mock import Mock, patch # patch never used
Ruff will catch this but worth fixing before it fails the pre-commit hook.

Minor / Nits

test_completion_id_format asserts len(response.id) == 38 — brittle if ID generation changes. len(response.id) > len("chatcmpl-") or a regex would be more robust.
patch unused import will fail ruff check.

system_fingerprint should be a backend config hash, not the model name. The model name is already in response.model. Set to None since we don't currently track backend config fingerprints. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

When partial usage data is provided (e.g., only prompt_tokens), total_tokens was incorrectly defaulting to 0. Now it's calculated as prompt_tokens + completion_tokens when not explicitly provided. This prevents silent bad values from going out over the OpenAI-compatible API. - Modified cli/serve/app.py to calculate total_tokens when missing - Updated test_usage_with_partial_data to expect correct behavior Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

Replace hardcoded length assertion with implementation-agnostic check. The test now validates that the ID has the correct prefix and a non-empty suffix, without coupling to specific length or format details. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

markstur · 2026-04-01T21:43:43Z

patch unused import will fail ruff check.

I never got that error in precommit. 🤷‍♂️

Updated with fixes for the comments

ajbozarth

Claude and I agree this LGTM, might want one more approval before merge for sanity

jakelorocco

lgtm; thank you

markstur requested a review from a team as a code owner April 1, 2026 04:04

github-actions Bot added the enhancement New feature or request label Apr 1, 2026

ajbozarth reviewed Apr 1, 2026

View reviewed changes

Comment thread test/cli/test_serve.py Outdated

Remove made with Bob comment

97356b0

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

markstur added 4 commits April 1, 2026 13:39

fix: remove unused import

43b9656

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

markstur requested a review from ajbozarth April 1, 2026 21:43

ajbozarth approved these changes Apr 1, 2026

View reviewed changes

jakelorocco approved these changes Apr 2, 2026

View reviewed changes

ajbozarth added this pull request to the merge queue Apr 2, 2026

Merged via the queue into generative-computing:main with commit c887d60 Apr 2, 2026
6 checks passed

planetf1 mentioned this pull request Apr 7, 2026

Create OpenAI API-compatible HTTP interface for mellea #521

Open

29 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(serve): improve OpenAI API compatibility with usage, finish_reas…#771

feat(serve): improve OpenAI API compatibility with usage, finish_reas…#771
ajbozarth merged 6 commits into
generative-computing:mainfrom
markstur:serve

markstur commented Apr 1, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 1, 2026

Uh oh!

Uh oh!

ajbozarth commented Apr 1, 2026

Uh oh!

markstur commented Apr 1, 2026

Uh oh!

ajbozarth left a comment

Uh oh!

jakelorocco left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

markstur commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Misc PR

Type of PR

Description

Testing

Uh oh!

github-actions Bot commented Apr 1, 2026

Uh oh!

Uh oh!

ajbozarth commented Apr 1, 2026

Issues

Uh oh!

markstur commented Apr 1, 2026

Uh oh!

ajbozarth left a comment

Choose a reason for hiding this comment

Uh oh!

jakelorocco left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

markstur commented Apr 1, 2026 •

edited

Loading